Testing for Proportionality of Multivariate Dispersion Structures Using Interdirections

Authors

  • Sujit Kumar Ghosh
  • Debapriya Sengupta
  • Sujit K. Ghosh
Abstract

Knowing whether the dispersion structures of two elliptically symmetric populations are proportional is an important problem in multivariate data analysis. Since the problem is invariant under nonsingular transformations, it is possible to reduce it to the situation where one population is spherically symmetric while the other has a diagonal dispersion structure. In this article we show that the problem is actually equivalent to testing uniformity of a distribution on the sphere in an appropriate Euclidean space. The main purpose is to demonstrate how the idea of interdirections, introduced by Randles (1989) in the context of multivariate sign tests, can be adapted to handle this situation. The findings of this article also enhance the possibility of developing a technology for robust multivariate data analysis using such fundamentally geometric concepts.

AMS 1991 Subject Classification. Primary 62H15, 62G10; secondary 62F05, 62G20, 62F35.

Key Words & Phrases. Affine Invariant Multivariate Sign Tests, Sphericity Tests, Elliptically Symmetric Distributions, Interdirections.

1 Introduction

The intrinsic symmetry of many multivariate nonparametric testing problems permits us to develop tests which are distribution free over a large class of null hypotheses. The main advantage of such test procedures is that we can control the level of the test over a large class of null hypotheses, which is one of the primary concerns in hypothesis testing. Also, in many problems of interest, distribution free test procedures do have good power properties. Another appealing feature of these procedures is their inherent simplicity.

The standard procedures in composite hypothesis testing problems typically consist of likelihood ratio (or uniformly most powerful invariant) tests for a suitable parametric subhypothesis of the original nonparametric hypothesis. In this paper we consider the following multivariate two sample dispersion structure problem.
Suppose $X_1, \ldots, X_{n_1}$ and $Y_1, \ldots, Y_{n_2}$ are two independent sets of random samples from two $p$-dimensional elliptically symmetric populations having densities given by $|\Sigma_1|^{-1/2} f(x^T \Sigma_1^{-1} x)$ and $|\Sigma_2|^{-1/2} g(y^T \Sigma_2^{-1} y)$ respectively. We wish to test $H_0 : \Sigma_2 \propto \Sigma_1$ (i.e., $\Sigma_2 = \lambda \Sigma_1$ for some $\lambda > 0$), treating $f$ and $g$ as unknown nuisance parameters. Notice that we are using the term `dispersion structure' in order to include cases where the second moments may not exist. Even in that case the parameters $\Sigma_1$ and $\Sigma_2$ are well defined for elliptically symmetric densities up to a constant multiple. In case the second moments are finite, they are proportional to the dispersion matrices. In addition to this we shall also assume the following restriction on the sample sizes, namely $n_1 = k_1 n$ and $n_2 = k_2 n$ with $0 < k_1, k_2 < 1$, $k_1 + k_2 = 1$. Moreover, the same sampling fractions are maintained even as $n$ tends to infinity.

The situation where the populations have unknown centers, say $\mu_1$ and $\mu_2$ respectively, is of more practical relevance. If we have a test procedure which is based on the assumption that $\mu_1 = \mu_2 = 0$, we can make an appropriate adjustment in the case when the centers are unknown, by first subtracting suitable estimates of the respective centers from the observations and then applying the same procedure to the centered data. By doing so we may alter the original distribution even asymptotically. It is not clear whether that is the case for the distribution free procedures we would like to consider in this article. We are able to present only some simulation evidence that the distributions are not altered much due to such precentering. The problem is worth pursuing from a theoretical level, and probably the approach of Randles (1982) and de Wet & Randles (1987) will give some clue in this direction.

When both $f$ and $g$ are standard normal densities, one can derive the likelihood ratio test (LRT) for the case when $\Sigma_1 = \Sigma_2$ under $H_0$.
There is a large volume of literature studying various properties of the LRT in the multivariate normal situation. We refer to Anderson (1958) and Muirhead (1982) for this. The (modified) LRT is given by

$$C_n = \frac{\det(A_{n1})^{q_1/2} \, \det(A_{n2})^{q_2/2}}{\det(A_{n1} + A_{n2})^{(q_1+q_2)/2}}$$

where $A_{n1} = \sum_{i=1}^{n_1} (X_i - \bar{X})(X_i - \bar{X})^T$, $A_{n2} = \sum_{i=1}^{n_2} (Y_i - \bar{Y})(Y_i - \bar{Y})^T$ and $q_1 = n_1 - 1$, $q_2 = n_2 - 1$. We shall treat the test statistic $C_n$ as the classical procedure for simulation purposes.

To motivate our technique we describe a closely related problem which has been studied in great detail and from which we borrow some key ideas. Let $X_1, \ldots, X_n$ be iid samples from a $p$-dimensional density which is elliptically symmetric about a point $\mu$. We wish to test $H_0 : \mu = 0$. The classical test procedure is Hotelling's $T^2$ statistic (cf. Anderson 1958), which is the LRT assuming the population is $N_p(\mu, \Sigma)$. When $p = 1$, the so called `sign test' is a well accepted distribution free procedure for this problem. In higher dimensions, several extensions of the concept of the univariate sign test exist in the literature. Since the testing problem is invariant under nonsingular transformations, it is worthwhile to consider affine invariant extensions of sign tests to higher dimensions. Various interesting procedures (for example, Hodges 1955, Blumen 1958, Oja & Nyblom 1989) are available. Randles (1989) and Peters & Randles (1990) introduced the concept of interdirections, which leads to a comprehensive study of a large class of multivariate sign tests. See also Chaudhuri & Sengupta (1993). The main findings of the above studies can be summarized as follows. After a reduction by affine invariance, the original problem reduces to testing whether the center of symmetry of a spherically symmetric distribution is zero. If one considers only distribution free procedures, the problem reduces further to the problem of testing uniformity on the unit sphere $S^{(p-1)}$ in $\mathbb{R}^p$.
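
As an illustration of the classical procedure, the statistic $C_n$ defined above can be evaluated on the log scale directly from the two samples. The following is a minimal sketch, assuming NumPy is available; the function names `scatter_matrix` and `log_modified_lrt` are hypothetical and do not come from the paper, and only the statistic itself is computed (no critical values).

```python
import numpy as np


def scatter_matrix(sample):
    """Centered sums of squares and products: sum_i (x_i - xbar)(x_i - xbar)^T."""
    sample = np.asarray(sample, dtype=float)
    centered = sample - sample.mean(axis=0)
    return centered.T @ centered


def log_modified_lrt(x, y):
    """Return log C_n for samples x (n1 x p) and y (n2 x p).

    log C_n = (q1/2) log det A_n1 + (q2/2) log det A_n2
              - ((q1 + q2)/2) log det (A_n1 + A_n2),
    with q1 = n1 - 1 and q2 = n2 - 1.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    q1, q2 = x.shape[0] - 1, y.shape[0] - 1
    a1, a2 = scatter_matrix(x), scatter_matrix(y)
    # slogdet returns (sign, log|det|); the log form avoids overflow.
    _, logdet1 = np.linalg.slogdet(a1)
    _, logdet2 = np.linalg.slogdet(a2)
    _, logdet12 = np.linalg.slogdet(a1 + a2)
    return 0.5 * q1 * logdet1 + 0.5 * q2 * logdet2 - 0.5 * (q1 + q2) * logdet12
```

Under a common nonsingular linear transformation of both samples, every determinant picks up the same factor and these factors cancel, so the value is unchanged. This matches the invariance under nonsingular transformations that the text relies on.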
Testing uniformity on the sphere is a very well studied problem, and a large number of tests have been proposed, especially when $p = 2$. Most of these procedures can be expressed as functions of angles between observations. As shown by Randles (1989) and Chaudhuri and Sengupta (1993), a large class of affine invariant multivariate sign tests can be thought of as approximations to various well-known tests of uniformity on $S^{(p-1)}$ in large samples. From this point of view the interdirections are actually affine invariant estimates of the angle between two observations, with the special property that they are distribution free under $H_0$.

For the dispersion structure problem we shall follow the same route. By invariance we can assume without loss of generality that one of the populations is spherically symmetric and the other has a diagonal dispersion structure. Thus we get a reduced problem where we have, say, $Z_1, \ldots, Z_n$ iid samples from the density $|\Sigma|^{-1/2} g(z^T \Sigma^{-1} z)$, where $\Sigma$ is diagonal, and we want to test $H_{0S} : \Sigma \propto I_p$, where $I_p$ is the $p \times p$ identity matrix. For this problem any distribution free procedure should be based on $Z_1/\|Z_1\|, \ldots, Z_n/\|Z_n\|$ due to the rotational symmetry in the problem. Therefore the problem (quite surprisingly!) reduces to testing uniformity on $S^{(p-1)}$. However, one should note that the nature of the alternative is different in this case, so that the test statistics turn out to be different from those of the location problem. Now we can consider any reasonable test for uniformity on $S^{(p-1)}$ and try to approximate it by affine invariant procedures based on the original observations, analogous to the location case.

The organization of the paper is as follows. In section 2 we describe the proposed test statistic. First the associated sphericity problem is considered. The actual test statistic proposed is an affine invariant approximation to its twin in the sphericity case. In the next section we develop a Wald type nonparametric test statistic along with other concluding remarks.
Finally, in section 4 all the tests are compared on the basis of simulations for various values of the nuisance parameters (i.e., $f$ and $g$). The technical details are provided in the appendix.

2 Construction of the test statistic

The construction of the test statistic will be described in two parts. As already mentioned, by virtue of invariance we can reduce the original problem to a problem of testing uniformity on $S^{(p-1)}$. Next, one can approximate an appropriate test statistic for the reduced problem by an (asymptotically equivalent) affine invariant version which will work for the actual problem.

2.1 Sphericity tests

Let $Z_1, \ldots, Z_n$ be an iid sample from a $p$-dimensional elliptically symmetric density given by $|\Sigma|^{-1/2} g(z^T \Sigma^{-1} z)$ with $g$ and $\Sigma$ unknown. We want to test $H_{0S} : \Sigma \propto I_p$. Since the problem is invariant under orthogonal transformations, one can assume that $\Sigma$ is diagonal without loss of generality. Also, the elementary invariant quantities for constructing distribution free procedures are given by $U_i = Z_i/\|Z_i\|$, $i = 1, \ldots, n$, which are iid uniform on $S^{(p-1)}$ under $H_{0S}$. The normal theory likelihood ratio statistic is given by (Anderson 1958)

$$L_n = \frac{|A_n|^{n/2}}{\left( \operatorname{tr}(A_n)/p \right)^{pn/2}} \qquad (2.1)$$

where $A_n = \sum_{i=1}^{n} Z_i Z_i^T$ is the uncorrected sums of squares and products matrix. The limiting behavior of $W_n = -2 \log L_n$ can be worked out (see Nagao & Srivastava 1973). It turns out that under elliptic symmetry the following holds.

Fact 2.1 (i) Under $H_{0S}$, $W_n$ has a limiting $(1 + \kappa)\, \chi^2_{p(p+1)/2 - 1}$ distribution, where $\kappa$ is the kurtosis of the density $g$. (ii) Under the sequence of local alternatives $\Sigma_n = I_p + \frac{1}{\sqrt{n}} \operatorname{diag}(c_1, \ldots, c_p)$, $W_n$ has a limiting $(1 + \kappa)\, \chi^2_{p(p+1)/2 - 1}(\delta^2)$ distribution with $\delta^2 = \frac{1}{1+\kappa} \sum_{i=1}^{p} (c_i - \bar{c})^2$.

The above fact establishes that the level of the test based on $W_n$ cannot be controlled even asymptotically under $H_{0S}$ unless we make some extra assumption regarding $\kappa$.
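
The quantities of section 2.1 are straightforward to compute. The sketch below, assuming NumPy is available, forms the directions $U_i = Z_i/\|Z_i\|$ and the normal theory statistic $W_n = -2 \log L_n$ from (2.1). The function names `unit_directions` and `sphericity_lrt` are hypothetical and are not taken from the paper.

```python
import numpy as np


def unit_directions(z):
    """Project each row Z_i onto the unit sphere: U_i = Z_i / ||Z_i||."""
    z = np.asarray(z, dtype=float)
    return z / np.linalg.norm(z, axis=1, keepdims=True)


def sphericity_lrt(z):
    """Return W_n = -2 log L_n for an (n x p) sample z, with
    L_n = |A_n|^(n/2) / (tr(A_n)/p)^(pn/2) and A_n = sum_i Z_i Z_i^T."""
    z = np.asarray(z, dtype=float)
    n, p = z.shape
    a = z.T @ z  # uncorrected sums of squares and products matrix
    _, logdet = np.linalg.slogdet(a)
    log_l = 0.5 * n * logdet - 0.5 * p * n * np.log(np.trace(a) / p)
    return -2.0 * log_l
```

By Fact 2.1 (i), $W_n$ would have to be divided by $1 + \kappa$ before being compared with a chi-square quantile on $p(p+1)/2 - 1$ degrees of freedom. Since $\kappa$ is unknown in general, the sketch stops at the statistic and does not return a p-value.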
In order to construct distribution free test procedures for $H_{0S}$ we look at the class of $U$-statistics of the form $S_n^*(h) = \sum_{i < \cdots}$ [...] $\cdots > 0$, $k = 1, \ldots, p$, where $x_k \sim N(0, 1)$ and $\|X\|^2 \sim \chi^2_p$.

Proof of Theorem 2.1

Let $U_i = (u_{i1}, \ldots, u_{ip})^T$, $i = 1, \ldots, n$; $k = 1, \ldots, p$. Using elementary algebra we have

$$\frac{S_n^*}{n} - \frac{n - p}{2p} = \left\{ \frac{1}{2} \sum_{k=1}^{p} \left( n^{-1/2} \sum_{i=1}^{n} t_{ikk} \right)^2 + \sum_{k < \cdots} \cdots \right.$$

[...]

For the first part of the proof we shall work under $H_0$, i.e., $\Sigma_2 \propto \Sigma_1$. However, before proving the desired results, let us first observe the following property of degenerate $U$-statistics. Let $U_{1n} = \sum\sum_{i<j} h_1(X_i, X_j)$ and $U_{2n} = \sum\sum_{i<j} h_2(X_i, X_j)$ be two degenerate $U$-statistics, i.e., $E(h_k(X_1, X_2) \mid X_1) = 0$ a.e. for $k = 1, 2$, based on a set of iid random vectors. Then $E(U_{1n} - U_{2n})^2 = \binom{n}{2} E(h_1(X_1, X_2) - h_2(X_1, X_2))^2$. This follows because a direct expansion shows

$$E(U_{1n} - U_{2n})^2 = \sum\sum_{i<j} \sum\sum_{k<l} E\left[ \{h_1(X_i, X_j) - h_2(X_i, X_j)\} \{h_1(X_k, X_l) - h_2(X_k, X_l)\} \right].$$

Next observe that by assumption both the $U$-statistics are degenerate. Thus whenever $\{i, j\} \neq \{k, l\}$ the corresponding term can be shown to be zero by suitably conditioning with respect to $i, j$ and taking expectation with respect to any index belonging to $\{k, l\}$ which does not belong to $\{i, j\}$. Hence the claim follows.

Now consider $S_{n1}$. Note that conditionally on the second sample, $S_{n1}$ is a degenerate $U$-statistic. Also, it follows from Randles (1989) that $\hat{\theta}_{n1}(i, j)$ converges in probability to the actual angle $\cos^{-1}(X_i, X_j)$ between the observations $X_i$ and $X_j$. In view of this we can claim that $\cos^2(\hat{\theta}_{n1}(i, j)) - \hat{\gamma}_{n1}(i) - \hat{\gamma}_{n1}(j) + \hat{\gamma}_{n1} \to (U_i^T U_j)^2 - E(U_1^T U_2)^2$ in probability. Also, in view of lemma 4.1 we have $E(U_1^T U_2)^2 = p^{-1}$. Next consider the quantity $E\{ S_{n1} - ( S_{n1}^*(\Sigma_1^{-1/2} X) - \binom{n_1}{2} p^{-1} ) \}^2$. Taking the conditional expectation of the above quantity given the second sample, in view of the above stated property we get

$$E_2 \left\{ S_{n1} - \left( S_{n1}^*(\Sigma_1^{-1/2} X) - \binom{n_1}{2} p^{-1} \right) \right\}^2 = \binom{n_1}{2} E_2 \left\{ \cos^2 \hat{\theta}_{n1}(1, 2) - \hat{\gamma}_{n1}(1) - \hat{\gamma}_{n1}(2) + \hat{\gamma}_{n1} - (U_1^T U_2)^2 + p^{-1} \right\}^2$$

where $E_2$ stands for the conditional expectation given the second sample. This proves the assertion that $E\{ S_{n1} - ( S_{n1}^*(\Sigma_1^{-1/2} X) - \binom{n_1}{2} p^{-1} ) \}^2 = o(n^2)$.
A similar set of arguments, after swapping the roles of the first and second sample, will give $E\{ S_{n2} - ( S_{n2}^*(\Sigma_2^{-1/2} Y) - \binom{n_2}{2} p^{-1} ) \}^2 = o(n^2)$. Now in view of theorem 2.1 we have

$$E\left( n^{-1} S_n + k_1 k_2 (p - 1) p \, \hat{S}_n \right)^2 \to 0 \qquad (5.1)$$

where

$$\hat{S}_n = k_1 k_2 \left\{ n_1^{-1} S_{n1}^*(\Sigma_1^{-1/2} X) - \frac{n_1 - p}{2p} + n_2^{-1} S_{n2}^*(\Sigma_2^{-1/2} Y) - \frac{n_2 - p}{2p} \right\}.$$

Hence theorem 2.2 follows. Next consider the situation under the local alternatives. Since the local alternatives considered are contiguous to the null, by making use of the arguments given in Randles (1989) it can be shown that (5.1) is a sufficient condition for a similar approximation to hold under the sequence of local alternatives as well. Hence corollary 2.3 follows from theorem 2.1 when applied separately to the pieces $S_{n1}^*$ and $S_{n2}^*$ respectively.

REFERENCES

Anderson, T. W. (1958), An Introduction to Multivariate Statistical Analysis, New York: John Wiley & Sons.
Arcones, M. A., and Giné, E. (1992), "Limit Theorems for U-Processes," The Annals of Probability, 21, 1494-1542.
Baringhaus, L. (1991), "Testing for Spherical Symmetry of a Multivariate Distribution," The Annals of Statistics, 19, 899-917.
Beran, R. J. (1968), "Asymptotic Theory of a Class of Tests of Uniformity of a Circular Distribution," Annals of Mathematical Statistics, 40, 1196-1206.
Blumen, I. (1958), "A New Bivariate Sign Test," Journal of the American Statistical Association, 53, 448-456.
Chaudhuri, P., and Sengupta, D. (1993), "Sign Tests in Multidimension: Inference Based on the Geometry of Data Cloud," Journal of the American Statistical Association, 88, 1363-1370.
De Wet, T., and Randles, R. H. (1987), "On the Effect of Substituting Parameter Estimators in Limiting $\chi^2$ U and V Statistics," The Annals of Statistics, 15, 398-412.
Giné, E. (1975), "Invariant tests for uniformity on compact Riemannian manifolds based on Sobolev norms," The Annals of Statistics, 3, 1243-1266.
Gregory, G. G. (1980), "On efficiency and optimality of quadratic tests," The Annals of Statistics, 8, 116-131.
Hajek, J., and Sidak, Z. (1967), Theory of Rank Tests, New York: Academic Press.
Hodges, J. L. (1955), "A Bivariate Sign Test," Annals of Mathematical Statistics, 26, 523-527.
Mardia, K. V. (1972), Statistics of Directional Data, London: Academic Press.
Muirhead, R. J. (1982), Aspects of Multivariate Statistical Theory, New York: John Wiley & Sons.
Nagao, and Srivastava (1973), Manuscript.
Oja, H., and Nyblom, J. (1989), "Bivariate Sign Tests," Journal of the American Statistical Association, 84, 249-259.
Peters, D., and Randles, R. H. (1990), "A Multivariate Signed Rank Test for the One-Sample Location Problem," Journal of the American Statistical Association, 85, 552-557.
Randles, R. H. (1989), "A Distribution-Free Multivariate Sign Test Based on Interdirections," Journal of the American Statistical Association, 84, 1045-1050.

Related resources

Asymptotic Linearity of Serial and Nonserial Multivariate Signed Rank Statistics

Asymptotic linearity plays a key role in estimation and testing in the presence of nuisance parameters. This property is established, in the very general context of a multivariate general linear model with elliptical VARMA errors, for the serial and nonserial multivariate rank statistics considered in Hallin and Paindaveine (2002a and b, 2004a) and Oja and Paindaveine (2004). These statistics, ...

Multivariate Signed Ranks : Randles’ Interdirections or Tyler’s Angles?

Hallin and Paindaveine (2002a) developed, for the multivariate (elliptically symmetric) one-sample location problem, a class of optimal procedures, based on Randles’ interdirections and the ranks of pseudo-Mahalanobis distances. We present an alternative version of these procedures in which interdirections are replaced by “Tyler angles”, namely, the angles between the observations standardized ...

Optimal signed-rank tests based on hyperplanes

For analysing k-variate data sets, Randles (1989) considered hyperplanes going through k− 1 data points and the origin. He then introduced an empirical angular distance between two k-variate data vectors based on the number of hyperplanes (the so-called interdirections) that separate these two points, and proposed a multivariate sign test based on those interdirections. In this paper, we presen...

On Liu's simplicial depth and Randles' interdirections

At about the same time (approximately 1989), R. Liu introduced the notion of simplicial depth and R. Randles the notion of interdirections. These completely independent and seemingly unrelated initiatives, serving different purposes in nonparametric multivariate analysis, have spawned significant activity within their quite different respective domains. A surprising and fruitful connection betw...

Modelling of Correlated Ordinal Responses, by Using Multivariate Skew Probit with Different Types of Variance Covariance Structures

In this paper, a multivariate fundamental skew probit (MFSP) model is used to model correlated ordinal responses which are constructed from the multivariate fundamental skew normal (MFSN) distribution originate to the greater flexibility of MFSN. To achieve an appropriate VC structure for reaching reliable statistical inferences, many types of variance covariance (VC) structures are considered ...

Inferences on the Generalized Variance under Normality

Generalized variance is applied for determination of dispersion in a multivariate population and is a successful measure for concentration of multivariate data. In this article, we consider constructing confidence interval and testing the hypotheses about generalized variance in a multivariate normal distribution and give a computational approach. Simulation studies are performed to compare thi...

Publication date: 2007